Error Correction of Japanese Character-recognition in Answers to Writing-type Questions Using T5


Authors

Rina Suzuki (Tokyo University of Agriculture and Technology); Hisao Usui (Tokyo University of Agriculture and Technology); Hiroaki Ozaki (Tokyo University of Agriculture and Technology); Kanako Komiya (Tokyo University of Agriculture and Technology)*; Hung T Nguyen (Tokyo University of Agriculture and Technology); Tsunenori Ishioka (National Center for University Entrance Examinations); Masaki Nakagawa (Tokyo University of Agriculture and Technology)
s248289x@st.go.tuat.ac.jp; h-usui@st.go.tuat.ac.jp; hiroaki-ozaki@st.go.tuat.ac.jp; kkomiya@go.tuat.ac.jp*; fx7297@go.tuat.ac.jp; tunenori@rd.dnc.ac.jp; nakagawa@cc.tuat.ac.jp

Abstract

This paper proposes a method for correcting character-recognition errors in Japanese handwritten answers to writing-type questions from exercise books. We built the correction model by fine-tuning the Text-to-Text Transfer Transformer (T5) on pairs of automatically recognized text from handwritten answers and their manual corrections. The data comprised handwritten Japanese answers from 185 junior high school students to writing-type questions in a Japanese language task. In addition, we augmented the training data using the five best results, with confidence scores, of the character-recognition model so that the correction model could learn additional patterns of recognition errors. The experimental results show that the answers corrected by the proposed method were closer to the actual answers than the uncorrected recognition results, and that data augmentation was effective for the correction model.
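
As an illustration of the data augmentation described in the abstract, the following minimal sketch pairs each of the top recognition candidates for an answer with its manual correction, giving several (recognized text, corrected text) training pairs per answer. The abstract does not specify how the confidence scores are used, so the function name `build_training_pairs`, the `min_confidence` filter, and the example strings are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def build_training_pairs(
    nbest: List[Tuple[str, float]],
    reference: str,
    top_k: int = 5,
    min_confidence: float = 0.0,
) -> List[Tuple[str, str]]:
    """Pair the top_k recognition candidates with the manual correction.

    nbest: (candidate_text, confidence) tuples from a recognizer (hypothetical format).
    reference: the manually corrected answer text.
    """
    # Rank candidates by confidence, highest first, and keep at most top_k.
    ranked = sorted(nbest, key=lambda c: c[1], reverse=True)[:top_k]

    pairs: List[Tuple[str, str]] = []
    seen = set()
    for text, confidence in ranked:
        # Assumed filter: drop low-confidence candidates and exact duplicates.
        if confidence < min_confidence or text in seen:
            continue
        seen.add(text)
        pairs.append((text, reference))
    return pairs


# Hypothetical recognizer output for one handwritten answer.
candidates = [
    ("私は本を読んだ", 0.90),
    ("私は木を読んだ", 0.60),
    ("私は本を続んだ", 0.40),
]
pairs = build_training_pairs(candidates, reference="私は本を読んだ")
for source, target in pairs:
    print(source, "->", target)
```

Each resulting pair could then serve as one source/target example when fine-tuning a text-to-text model; candidates that contain recognition errors expose the model to more error patterns than the single best result alone.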